Abstract: We consider three types of application layer coding
for streaming over lossy links: random linear coding, systematic
random linear coding, and structured coding. The file being
streamed is divided into sub-blocks (generations). Code symbols
are formed by combining data belonging to the same generation,
and transmitted in a round-robin fashion. We compare
the schemes based on delivery packet count, net throughput,
and energy consumption for a range of generation sizes. We
determine these performance measures both analytically and in
an experimental configuration. We find our analytical predictions
to match the experimental results. We show that coding at the
application layer brings about a significant increase in net data
throughput, and thereby reduction in energy consumption due
to reduced communication time. On the other hand, on devices
with constrained computing resources, heavy coding operations
cause packet drops in higher layers and negatively affect the
net throughput. We find from our experimental results that
low-rate MDS codes are best for small generation sizes, whereas
systematic random linear coding has the best net throughput
and lowest energy consumption for larger generation sizes due
to its low decoding complexity.

I. Introduction 

With the rapid increase in multicast streaming applications, 
we see more and more proposals for application-layer rateless 
erasure coding. A number of these schemes have already 
been standardized and are currently being considered for 
implementation, such as Raptor codes 0] for Multimedia 
Broadcast/Multicast Service (MBMS). The goal of these
schemes is to combat transport-layer packet losses. 

In packet-based data networks, large files are usually segmented
into smaller blocks that are put into transport packets.
Packet losses occur not only because of the physical channel 
limitations between the sender and the receiver, but also when 
the sender pushes data at rates that exceed the speed at which 
the receiver can take in packets, given its limited processing 
power and buffer space. In point-to-point scenarios, the sender 
can adjust its transmission rate to avoid packet losses, and 
retransmit lost packets according to the feedback from the 
receiver, to ensure efficient and reliable data delivery. Thus 
unicast applications usually implement some ARQ protocol. In 
broadcast applications from a single sender to many receivers, 
however, it is costly for the sender to collect and respond 
to individual receiver feedbacks, and thus packet losses are 
inevitable. 

We here consider three rateless coding schemes to combat 
random packet losses in single-hop scenarios. Two are based
on random linear coding, and the third is based on structured
MDS coding such as Reed-Solomon (RS). Note that our
scenario of interest is wireless streaming, rather than
transmission over networks as in random linear network coding
J5] over generations [4]. All schemes follow the round-robin 
scheduling. Since there is no feedback until the entire file has 
been downloaded, the round-robin protocol may result in many 
superfluous transmissions for already decoded generations. We 
study the schemes both theoretically and by experiment. 

This paper is organized as follows: In Section II we
introduce our coding and scheduling models and define our
performance measures. In Section III we present a theoretical
analysis of the schemes. In Section IV we describe the
experimental setup, and present measurement results collected
on a mobile platform. In Section V we discuss the results and
future work.

II. System Model 

We consider transmission without feedback over a memoryless
binary erasure channel between the sender and the
receivers. In a packet network, the erasure rate is evaluated as 
the packet loss rate, denoted as e. For the theoretical analysis, 
we assume that e stays constant, regardless of time and the 
transmission protocol. 

A. Performance Measures 

We will measure the performance of the system by the 
delivery packet count, delivery time, and energy consumption. 
Delivery packet count is defined as the number of packets 
that have to be sent until the receiver is able to recover the 
entire file. Delivery time is the time the receiver has to spend 
in the system until it is able to recover the content. It is a 
random variable that depends on the delivery packet count, 
the packet size, and the rate of data transmission. In a wireless 
network, energy consumption mainly depends on the delivery 
time and the transmission power, since the power consumption 
in transmission is dominating. 

B. Coding within Generations 

Suppose a file is segmented into N blocks for transmission. 
A block fits into the data payload of an application layer 
packet. Throughout the paper, we use the words "block" and
"packet" interchangeably. To combat random packet losses,
we apply erasure codes at the block level. That is, instead
of transmitting the original file blocks, the sender transmits a
coded block formed from the original blocks. The coded block
contains the same number of information bits as an
original file block. Due to practical concerns such as
computational complexity, however, when N is large the erasure codes are
applied to subsets of the N blocks. These subsets are referred
to as generations, and each file block belongs to at least one
of the generations. Suppose there are n generations, denoted
by $G_1, G_2, \ldots, G_n$, and assume a uniform generation size of
g. Note that when g = 1, the coded blocks are effectively the
original blocks. In each transmission, the sender selects one
of the n generations, and sends a coded block composed from 
the selected generation. A transmission scheme is therefore 
defined by three aspects: the composition of generations, the 
encoding scheme of blocks within each generation, and the 
order of selecting generations from which a coded block is created.
The last component is referred to as generation scheduling. 
When the generation size g = 1, it is simply the question 
of which block to send in each transmission. At the receiver, 
coded blocks are classified by their originating generations, 
and decoding is performed within each generation. 

In [5], the delivery time of coding within both disjoint and
overlapping generations has been studied when generations 
are scheduled at random and when coded blocks are random 
linear combinations over a finite field. In this paper we discuss 
selecting generations in a round-robin fashion: send one coded 
packet from each generation sequentially and wrap around. 
As for the encoding scheme within the generations, we study 
three schemes: (1) the random linear combination approach 
as in [2], (2) the random linear combination approach with a
systematic phase, and (3) using an MDS (maximum distance 
separable) erasure code. 

1) Random Linear Combinations over GF(q) (RL): Each
block is represented as a row vector of symbols from the finite
field GF(q), and the whole file is represented as a matrix of
N rows, one block per row. We abuse notation slightly
and use $G_j$ to denote the matrix representing generation
$G_j$. To generate a coded block from a generation $G_j$ of g
blocks, choose a coding vector $c = [c_1, c_2, \ldots, c_g]$ by choosing
g symbols independently and equiprobably from GF(q). The
resulting coded block is then $c \cdot G_j$.
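
The encoding step of the RL scheme can be sketched as follows. This is a minimal illustration that assumes the binary field (q = 2), where multiplication by a coefficient is a select and addition is a bytewise XOR; the function name `rl_encode` and the representation of blocks as `bytes` objects are assumptions made for this example, not part of the implementation used in the experiments.

```python
import random


def rl_encode(generation, coding_vector=None):
    """Return (coding_vector, coded_block) for one generation over GF(2).

    generation: list of g equal-length bytes objects (the source blocks).
    coding_vector: optional list of g bits; drawn uniformly at random if omitted.
    """
    g = len(generation)
    if coding_vector is None:
        # Each coefficient is chosen independently and equiprobably from {0, 1}.
        coding_vector = [random.getrandbits(1) for _ in range(g)]
    coded = bytearray(len(generation[0]))
    for c_i, block in zip(coding_vector, generation):
        if c_i:  # coefficient 1 keeps the block, coefficient 0 drops it
            for k, byte in enumerate(block):
                coded[k] ^= byte  # addition in GF(2) is XOR
    return coding_vector, bytes(coded)
```

The coding vector is returned together with the coded block because the receiver needs it to set up the linear system for the generation.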

2) Random Linear Combinations Including a Systematic 
Phase (RLS): This is a variation of the RL scheme that 
includes a systematic phase at the beginning: send the original 
blocks from A to Z before starting to send random linear 
combinations of the original file blocks. 

3) Maximum Distance Separable Codes (MDS): Over a
finite field of small size, such as the common binary field,
random linear combinations chosen in the way specified in
the RL scheme inevitably introduce non-negligible linear
dependency between the coded packets. For short lengths
of data, we can use low-rate MDS codes instead. With an
MDS(K, g) code, g packets are encoded into K coded packets,
and all the g packets are recoverable as soon as any g of
the K distinct coded packets have been collected. To extend
transmission after the sender has exhausted all the K coded
packets, the sender repeats the coded packets in a round-robin
fashion. The parity check code is a binary MDS code where
K = g + 1. Reed-Solomon codes are another important class
of MDS codes that operate on GF($2^l$) with $g < K < 2^l$. The
increased complexity that comes with operations on a finite 
field of large size, however, can possibly undo the benefit 
brought by the MDS property, as we will later show in our 
experimental results. 
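
As an illustration of the MDS property in its simplest form, the parity check code with K = g + 1 can be sketched as below. The helper names (`xor_blocks`, `pc_encode`, `pc_decode`) and the dictionary of received packets keyed by packet index are assumptions made for this example only.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def pc_encode(generation):
    """PC(g + 1, g): the g source packets followed by one XOR parity packet."""
    return list(generation) + [xor_blocks(generation)]


def pc_decode(received, g):
    """Recover the g source packets from a dict {index: packet}.

    Indices 0..g-1 are source packets and index g is the parity packet.
    Any g of the g + 1 coded packets are sufficient.
    """
    if len(received) < g:
        raise ValueError("need at least g of the g + 1 coded packets")
    missing = [i for i in range(g) if i not in received]
    if missing:
        # Exactly one source packet is missing; XOR of the other g packets restores it.
        received = dict(received)
        received[missing[0]] = xor_blocks(list(received.values()))
    return [received[i] for i in range(g)]
```

With only one parity packet, a second loss within the same generation cannot be repaired until the sender repeats the coded packets.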

III. Statistics of the Delivery Packet Count 

In this section, we characterize T, the delivery packet count 
of coding within disjoint generations following the round-robin 
generation scheduling scheme. We assume that in each round, 
one coded packet is created from a generation that is selected 
from the n generations in a wrap-around fashion. After the t-th
transmission, $m_t = \lfloor t/n \rfloor$ rounds have been completed. By
that time, $(m_t + 1)$ packets will have been sent from each of
the first $t - m_t n$ generations, and $m_t$ packets from each of
the remaining $(m_t + 1)n - t$ generations.
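
The bookkeeping of the round-robin schedule can be checked with a short sketch. The function name and the variable `r_t` (the number of generations already served in the current round) are introduced here for illustration.

```python
def packets_sent_per_generation(t, n):
    """Packets sent from each of n generations after t round-robin transmissions."""
    # m_t completed rounds; r_t generations already served in the current round.
    m_t, r_t = divmod(t, n)
    return [m_t + 1] * r_t + [m_t] * (n - r_t)
```

For example, with n = 4 generations and t = 10 transmissions, two rounds are complete and the first two generations have each been served a third time.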

Since the generations are disjoint, each generation is decoded
independently. Let $M_{g,e}$ be the number of coded packets
that need to be sent over a link of packet erasure rate e from
a generation of size g so that the receiver can decode all
file packets in the generation. Let $p_{m,g,e}$ be the probability
that $M_{g,e} \le m$. Let $P_t$ be the probability that $T \le t$. Then,

$$P_t = \left(p_{m_t+1,g,e}\right)^{r_t} \left(p_{m_t,g,e}\right)^{n - r_t}, \quad \text{where } m_t = \lfloor t/n \rfloor \text{ and } r_t = t - m_t n.$$

Note that since $p_{m,g,e}$ is the cumulative distribution function of
$M_{g,e}$, $p_{m,g,e}$ is non-decreasing in m, and hence $P_t$ is bounded
as follows:

$$\left(p_{m_t,g,e}\right)^n \le P_t \le \left(p_{m_t+1,g,e}\right)^n. \qquad (1)$$

Hence, since $E[T] = \sum_{t=0}^{\infty} (1 - P_t)$ and each value of $m_t$ is shared by n consecutive values of t,

$$n \sum_{m=1}^{\infty} \left(1 - (p_{m,g,e})^n\right) \le E[T] \le n \sum_{m=0}^{\infty} \left(1 - (p_{m,g,e})^n\right). \qquad (3)$$

In the following, we characterize p m ,g,e for different coding 
schemes within each generation. 
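
The expected delivery packet count and the bounds in (3) can be evaluated numerically for any per-generation distribution. The following is a minimal sketch: the function name, the truncation tolerance `tol`, and the cap `max_rounds` are assumptions of this example, and `p` stands for whichever cumulative probability $p_{m,g,e}$ applies to the coding scheme at hand.

```python
def delivery_count_stats(p, n, tol=1e-12, max_rounds=100000):
    """Return (lower bound, E[T], upper bound) for n generations, round-robin.

    p(m) must return the probability that a generation is decodable after m of
    its coded packets have been sent, i.e. the cdf p_{m,g,e}.
    """
    exact = lower = upper = 0.0
    for m in range(max_rounds):
        p_m, p_next = p(m), p(m + 1)
        # For t = m*n + r (r = 0..n-1), r generations have m+1 packets and n-r have m.
        exact += sum(1.0 - p_next ** r * p_m ** (n - r) for r in range(n))
        upper += n * (1.0 - p_m ** n)
        lower += n * (1.0 - p_next ** n)
        if 1.0 - p_m ** n < tol:
            break  # remaining tail terms are negligible
    return lower, exact, upper


# Usage: uncoded transmission (g = 1), where a block is missing after m sends
# only if all m copies were erased, with e = 0.15 and n = 4 generations.
bounds = delivery_count_stats(lambda m: 1.0 - 0.15 ** m, n=4)
```

The sum is truncated once the probability that some generation is still undecoded falls below the tolerance, which is adequate when $p_{m,g,e}$ approaches 1 geometrically.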

A. RL Scheme 

In this scheme, each coded packet is statistically the same; 
it is simply a random linear combination of the source packets. 
To decode a generation of size g, a number g of linearly 
independent coded packets must be received. When m coded 
packets have been transmitted over the channel with erasure 
rate e, some $j \ge g$ have to be received, and among them
g have to be linearly independent. Therefore, the probability
$p^{RL}_{m,g,e}$ of successful decoding, given that $m \ge g$ coded packets
have been transmitted, is given as follows:

Claim 1:

$$p^{RL}_{m,g,e} = \sum_{j=g}^{m} \binom{m}{j} (1-e)^j e^{m-j} \prod_{k=0}^{g-1} \left(1 - q^{k-j}\right). \qquad (4)$$

The product in the equation is the probability that a $j \times g$ matrix
with random entries chosen independently and equiprobably
from GF(q) is of full column rank g. It is equal to the
probability of having g linearly independent coded packets 
among j coded packets. We can lower bound this product as 
follows (see JU, Lemma 7): 
$$\prod_{k=0}^{g-1} \left(1 - q^{k-j}\right) \ge \begin{cases} 0.288, & \text{if } q = 2 \text{ and } g = j; \\ 1 - \dfrac{1}{q^{j-g}(q-1)}, & \text{otherwise.} \end{cases}$$
When q is large, we can further approximate $p^{RL}_{m,g,e}$ as
follows:

$$p^{RL}_{m,g,e} \approx \sum_{j=g}^{m} \binom{m}{j} (1-e)^j e^{m-j}.$$
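
The decoding probability of the RL scheme and its large-q approximation can be evaluated with a short sketch. It assumes that the full-rank probability of a random $j \times g$ matrix over GF(q) takes the standard form $\prod_{k=0}^{g-1}(1 - q^{k-j})$; the function names are chosen for this example.

```python
from math import comb


def full_rank_prob(j, g, q):
    """Probability that a random j x g matrix over GF(q) has rank g (j >= g)."""
    prob = 1.0
    for k in range(g):
        prob *= 1.0 - q ** (k - j)
    return prob


def p_rl(m, g, e, q):
    """Probability that a generation of size g is decodable after m RL packets are sent."""
    return sum(
        comb(m, j) * (1 - e) ** j * e ** (m - j) * full_rank_prob(j, g, q)
        for j in range(g, m + 1)
    )


def p_rl_large_q(m, g, e):
    """Large-q approximation: any g received packets are assumed to suffice."""
    return sum(comb(m, j) * (1 - e) ** j * e ** (m - j) for j in range(g, m + 1))
```

Comparing `p_rl(m, g, e, 2)` with `p_rl_large_q(m, g, e)` for the same m shows the penalty paid for linear dependency in the binary field.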
B. RLS Scheme



This scheme consists of two phases. In the first (systematic) 
phase, only uncoded packets are sent, and the second phase 
is the same as the RL scheme described above. For each 
generation, first each of the original g packets is transmitted 
once, and random linear combinations of all the packets 
afterwards. Therefore, after m transmissions, a generation can
be decoded if $l$ packets are received during the first phase
of g transmissions, and $g - l$ coded packets that are linearly
independent of the first $l$ are received in the second phase
of $m - g$ transmissions. The probability $p^{RLS}_{m,g,e}$ of successful
decoding, given that $m \ge g$ coded packets have been transmitted
over the channel with erasure rate e, is given as follows:

The probability of receiving h linearly independent packets 
from m transmissions is 

Claim 2: 
Proof: Please refer to the appendix. ■
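
The two-phase decoding event described for the RLS scheme can be turned into a numerical sketch. The code below is derived directly from that description and is not a transcription of the displayed statement of Claim 2: it conditions on $l$ systematic packets being received and requires the coded packets received in the second phase to have full rank $g - l$ on the remaining unknowns. The function name `p_rls` is an assumption of this example.

```python
from math import comb


def p_rls(m, g, e, q):
    """Probability that a generation of size g is decodable after m RLS packets are sent."""
    if m < g:
        return 0.0  # fewer than g packets sent, so fewer than g can be received
    total = 0.0
    for l in range(g + 1):  # l systematic packets received out of g
        p_sys = comb(g, l) * (1 - e) ** l * e ** (g - l)
        need = g - l  # unknowns left after the systematic phase
        p_coded = 0.0
        for j in range(need, m - g + 1):  # j coded packets received out of m - g
            rank = 1.0
            for k in range(need):
                rank *= 1.0 - q ** (k - j)  # j x need random matrix has full rank
            p_coded += comb(m - g, j) * (1 - e) ** j * e ** (m - g - j) * rank
        total += p_sys * p_coded
    return total
```

At m = g the expression reduces to $(1 - e)^g$, the probability that every systematic packet arrives.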

C. MDS Scheme

Suppose we use an MDS code which encodes g symbols 
into K symbols such that the g symbols can be entirely recovered as
long as g distinct symbols have been received. We apply the 
code to generate K encoded packets from g original packets, 
and transmit the K encoded packets in a round-robin fashion. 

Let $u_m = \lfloor m/K \rfloor$ and $v_m = m - u_m K$ be the quotient and
the remainder of the number of transmissions m divided by
the code block length K. Then, after m transmissions, the first
$v_m$ of the K encoded packets have been transmitted $u_m + 1$
times and the last $K - v_m$ of the K encoded packets have
been transmitted $u_m$ times. The probability that an encoded
packet has been received is then $1 - e^{u_m + 1}$ for any packet
among the first $v_m$, and $1 - e^{u_m}$ among the last $K - v_m$.
The probability $p^{MDS}_{m,g,e}$ of successful decoding of all the g
packets, given that m encoded packets have been transmitted, is
equal to the probability that at least g of the K encoded packets
have been received, or at most $K - g$ encoded packets have
never been received. Therefore, $p^{MDS}_{m,g,e}$ can be computed by
summing up the probability that $l$ of the first $v_m$ packets are
absent and $j$ of the remaining $K - v_m$ packets are absent
in the receiver collection for all integers $l$ and $j$ satisfying
$0 \le l + j \le K - g$.
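Claim 3:

$$p^{MDS}_{m,g,e} = \sum_{\substack{l, j \ge 0 \\ l + j \le K - g}} \binom{v_m}{l} e^{(u_m+1) l} \left(1 - e^{u_m+1}\right)^{v_m - l} \binom{K - v_m}{j} e^{u_m j} \left(1 - e^{u_m}\right)^{K - v_m - j}, \qquad (7)$$

where $u_m = \lfloor m/K \rfloor$, $v_m = m - u_m K$, and $\binom{a}{b} = 0$ for $b > a$.
When $m < K$, we have $u_m = 0$ and $v_m = m$, and (7) becomes

$$p^{MDS}_{m,g,e} = \sum_{l=0}^{m-g} \binom{m}{l} e^{l} (1 - e)^{m-l}.$$

When K = g, the code is the repetition code, and the right
hand side of (7) becomes $(1 - e^{u_m+1})^{v_m} (1 - e^{u_m})^{K - v_m}$.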
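
The counting argument for the MDS scheme can be evaluated with a short sketch that sums over the numbers of never-received packets in the two groups, as described above. The function name `p_mds` and the local names `lost_hi` and `lost_lo` are assumptions of this example.

```python
from math import comb


def p_mds(m, g, K, e):
    """Probability that g packets are decodable after m MDS(K, g) transmissions."""
    u, v = divmod(m, K)     # quotient and remainder of m divided by K
    lost_hi = e ** (u + 1)  # a packet sent u + 1 times is never received
    lost_lo = e ** u        # a packet sent u times is never received
    total = 0.0
    for l in range(min(v, K - g) + 1):  # l of the first v packets absent
        p_l = comb(v, l) * lost_hi ** l * (1 - lost_hi) ** (v - l)
        for j in range(min(K - v, K - g - l) + 1):  # j of the last K - v absent
            p_j = comb(K - v, j) * lost_lo ** j * (1 - lost_lo) ** (K - v - j)
            total += p_l * p_j
    return total
```

When m < K the packets that have not yet been sent are absent with probability one, so only the term with j = K - m survives, which matches the special case stated for m < K.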

IV. Numerical and Experimental Results 

To evaluate the performance of the schemes discussed in the 
previous section, we implemented them on an experimental 
platform consisting of a laptop computer and a smartphone. 
We measure the time and the energy consumption required 
for the receiver to recover the whole file. In this section, the 
experimental results are presented along with the theoretical 
predictions. 

A. Experimental Setup 

The experimental setup consists of an HP Pavilion dv5-1120eg
laptop computer as a transmitter and a Nokia N8
smartphone as a receiver. The specifications for the Nokia N8
are shown in Table I.

TABLE I
Specifications of the Nokia N8

Operating System    Symbian^3
CPU                 ARM11 @ 1 GHz
Memory              256 MB SDRAM
Display             640 x 360 pixels, 3.5 inch
Battery             BL-4D (3.7 V, 1200 mAh Li-Ion)




Both the laptop and the smartphone run the same native
C++ application (in sender and receiver mode, respectively)
implemented using the Qt cross-platform application framework.
The laptop transmits a file at a nominal application-layer
data rate of 1000 KB/s via UDP, using IEEE 802.11b at a
physical layer rate of 11 Mbps. A transmitted file consists of
512 random packets having 1400 data bytes each. These data
packets are encoded following the three encoding schemes
described in Section II. The receiving cell phone tries to
decode the original file without sending any feedback
information to the sender except for a final completion indicator
transmitted only when the file is fully decoded. The sender
stops transmission once it has received this completion signal.

Fig. 1. Predicted expected delivery packet count (number of transmitted
packets required to recover the entire file) versus generation size (assuming
packet loss rate e = 0.15). RL: Random linear combinations. RLS: Random
linear combinations with a systematic phase. RS(255,g): Reed-Solomon codes.
PC(g + 1,g): A systematic code with a single coded packet as the bit-by-bit
XOR-sum of all file packets.

During the measurements the following information is 
recorded: 

1) Number of packets sent before receiving the completion 
signal. 

2) Number of packets received before sending the completion
signal.

3) Time elapsed from the time when the first packet is
received to the completion time.

4) Energy consumption by the receiver during the elapsed
time. The test application uses the Control API of the
Nokia Energy Profiler Q to programmatically monitor
(and record) the energy consumption of the mobile
phone. The margin of error for these energy readings 
is 3%. 

Each test was repeated 100 times for each generation size 
and encoding scheme pair. The following section presents the 
experimental results observed. 

B. Results 

Figure 1 shows theoretical predictions for the normalized
delivery packet count (i.e., how many packets are needed to
successfully deliver one packet) under typical channel conditions
in our experimental setup. The predictions are calculated
from the characterization of E[T] in Section III, where $p_{m,g,e}$
is obtained from (4), (6), or (7). We
observed that the packet loss rate (e) is around 15% on an
idle receiver when the sender is transmitting at a nominal
rate of 1000 KB/s. The RL and RLS schemes encode over the
binary field, and the MDS schemes are represented by a
Reed-Solomon (RS) code with K = 255 coded packets per generation
of g packets, and a simple Parity Check (PC) code with
K = g + 1 that has one parity
symbol (all original symbols XORed together). We observe
that the overhead per packet drops as the generation size
increases, because the probability of transmitting a packet for
an already decoded generation decreases. This is not true for
PC(g + 1,g), which can only cope with very low packet loss rates.
The incorporation of a systematic phase in the random linear
combination approach helps to reduce overhead for small
generation sizes, but the gap quickly closes as the generation
size increases. The Reed-Solomon code curve is near optimal
since the code rates we use (R < 0.51) are well below
the fraction of packets that get through the channel (1 - e, about 0.85),
and with a high probability the
transmission finishes before the sender runs out of the 255
coded packets for each generation.

Fig. 2. Measured delivery packet count versus generation size

Figure 2 shows the average number of packets sent per
successfully delivered packet as measured in our experiments.
This was calculated using the total number of packets sent and
received divided by the number of packets in the test file (i.e.,
512). For small generation sizes, we observe that increasing
the generation size lowers the overhead per packet. These
values are in accordance with the predictions in Figure 1. We
would expect this trend to continue, since ideally we would
use a single generation for the entire file. This would eliminate
the possibility of transmitting packets that belong to an already
decoded generation. However, this is not the optimal strategy
in practice due to the increasing computational complexity.
Figure 2 shows that the overhead per sent packet increases
significantly for the RS(255,g) scheme when g > 16, and for
the RL scheme when g > 64. This indicates that the computational
load on the receivers was too high, and they were unable
to keep up with the transmission rate of the sender. The offset
between the RS(255,g) and RL schemes may be explained by
the larger field size used by the RS(255,g) scheme. Utilizing
large fields (e.g., $q = 2^8$) typically requires some form of
memory-based look-up table to perform multiplication and
division, whereas all operations in the binary field (q = 2)
may be implemented using CPU instructions for binary XOR
and AND operations. The RS implementation was based on
a non-systematic Vandermonde matrix; other approaches such
as utilizing binary Cauchy matrices [8] should be considered
to further increase the performance of this implementation.
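
The difference in arithmetic cost between the two field sizes can be illustrated with a small sketch. It builds logarithm and antilogarithm tables for GF($2^8$) and contrasts table-based multiplication with the plain XOR used in the binary field. The primitive polynomial 0x11D is an assumption of this example (a common choice in Reed-Solomon implementations); the polynomial and table layout used in the experiments are not specified here.

```python
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1, assumed for illustration

# EXP[i] = alpha^i and LOG[alpha^i] = i for the generator alpha = 2.
EXP = [0] * 510
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= PRIM_POLY  # reduce modulo the primitive polynomial
for i in range(255, 510):
    EXP[i] = EXP[i - 255]  # duplicate so LOG[a] + LOG[b] needs no modulo


def gf256_mul(a, b):
    """Multiply in GF(2^8) using two table look-ups and one integer addition."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf2_combine(block_a, block_b):
    """Add two blocks over GF(2): a bytewise XOR, no tables required."""
    return bytes(p ^ q for p, q in zip(block_a, block_b))
```

Every symbol multiplication in the larger field touches memory twice, whereas the binary-field combination works on whole machine words with a single instruction, which is consistent with the gap observed between the RS and RL curves.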

The lower computational requirements associated with the 
systematic packets in the RLS scheme clearly benefit the 
overall system performance. It is however worth noting that 
the systematic phase assumes that the receivers did not receive 
any packets previously. The systematic phase might lead to 
an additional overhead if the state of the receivers is initially 
unknown. The curve of the RLS scheme only deviates from 
the predicted values for very high generation sizes, 256 and 
512. 

Figure 3 shows that when we plug the average (application-layer)
packet loss rate observed from the experiments (the loss
rates are higher for larger generation sizes) into Claims 1-3, 
the theoretical predictions still match experimental data. This 
confirms the validity of our characterization. 

The energy consumption of the communication system
is especially important on battery-driven mobile devices. In
Figure 4 we show the average energy consumption in Joules
per file download. This is compared to the net throughput
observed throughout the test. Due to the dominant impact of
the wireless radio on the power consumption, we observe a
strong connection between these two measured quantities.
As the power consumption of the wireless radio remains
relatively stable when not in power-save or sleep mode, the
energy consumption largely depends on the time needed to
complete a test, and thus it is inversely proportional to the
net rate. In order to minimize the energy consumption of the
protocol, we need to maximize the net rate.
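
The inverse relation between energy and net rate follows from a constant-power model, sketched below. The function name is chosen for this example and the power value in the usage line is a hypothetical placeholder, not a measurement from our experiments.

```python
def download_energy_joules(file_bytes, net_rate_bytes_per_s, radio_power_watts):
    """Energy for one download when the radio draws roughly constant power."""
    download_time_s = file_bytes / net_rate_bytes_per_s
    return radio_power_watts * download_time_s


# Usage: a file of 512 packets with 1400 data bytes each, at a hypothetical 1.0 W.
FILE_BYTES = 512 * 1400
energy_at_rate = download_energy_joules(FILE_BYTES, 358400, 1.0)
energy_at_double_rate = download_energy_joules(FILE_BYTES, 716800, 1.0)
```

Under this model, doubling the net rate halves the energy per download, regardless of the assumed power level.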

When comparing the RS(255,g) to the binary-field RL and 
RLS codes, we observe that using a larger field size yields a 
better code performance at lower generation sizes. On the other 
hand, it is unable to sustain the low overhead as the generation 
size and thereby the computational complexity increases. 

Although these results and the specific optimal values are 
certainly device- and system-dependent, we expect that other 
devices would exhibit similar tendencies, with the actual values
shifted depending on the capabilities of the given
platform. Faster devices might be able to support higher
generation sizes and higher data rates. 

V. Conclusion and Future Work 

In this paper, we considered three application-layer coding 
schemes for streaming over lossy links: random linear coding 
(RL), systematic random linear coding (RLS), and structured 
coding (MDS). We characterized the exact distribution and 
the expected value of the delivery packet count of coding 
within disjoint generations following the round-robin generation
scheduling scheme, taking into account the effect of
field size and generation size. Our characterization matches 
experimental results. 

The three coding schemes were implemented on a laptop
computer and a Nokia N8 smartphone using the Qt cross-platform
application framework. We presented measurement
results collected during numerous experiments with various
settings. Results show that the computational complexity has
a significant impact on the performance of these schemes. The
RLS scheme is the least computationally intensive, and is thereby
able to achieve the highest net data rate and the lowest
energy consumption.

In the future, we plan to implement other codes such as LT 
codes, Raptor codes, and systematic Reed-Solomon codes on 
the same testbed in order to compare their performance to the
coding schemes discussed in this paper. The cost of random
memory access and finite field operations is non-negligible 
on a terminal with constrained capacity. A model should be 
devised that can account for these factors to give predictions 
on other platforms with different capabilities and constraints. 
Proof of Claim 2